Summary


Diurnal variation of rainfall over the oceans has long been a great concern
of members of the meteorological community. In this paper we examine such
variation using part of the data collected from the GARP Atlantic Tropical Experiment
(GATE), which was conducted in 1974 in the North Atlantic ocean. The data were
collected from 10,000 grid points arranged as a 100 x 100 array; each grid covered a
4 km x 4 km area. The amount of rainfall was measured every 15 minutes during the
experiment periods using C-band radars. We analyzed data collected in Phases I
and II of GATE, which were conducted between the 179th and 197th days and between the 209th
and 227th days of the year, respectively.

Two types of analyses were performed on the data: analysis of diurnal variation
was done on each of the grid points based on the rainfall averages at noon and
at midnight, and time series analysis was done on selected grid points based on the
hourly averages of rainfall.

Since there is no known distribution model which best describes the rainfall
amount, nonparametric methods were used to examine the diurnal variation. This
kind of method was selected because of its model-free nature. The Kolmogorov-Smirnov
test was used to test if the rainfalls at noon and at midnight have the
same statistical distribution. The Wilcoxon signed-rank test was used to test if the
noon rainfall is heavier than, equal to, or lighter than the midnight rainfall.
These tests were done on each of the 10,000 grid points at which the data are
available.

Among the 10,000 grid points, data are not available at 1,872; these grid points
are around the boundary of the square covered by GATE. In addition, there are
1,743 grid points where no conclusion was drawn due to insufficient frequency of
rain at noon or at midnight during the experiment periods. Both the Kolmogorov-Smirnov
and Wilcoxon signed-rank tests were conducted at the remaining 6,385 grid
points.




With the Kolmogorov-Smirnov test it was found that, at a one out of 10 chance of
error, the rainfall distributions at noon and at midnight are the same at 5,425 grid
points and different at 960 grid points. In terms of percentage, they are 85.0 and
15.0%, respectively. This is in contrast to 90 and 10%, respectively, if the
assumption of no difference between noon and midnight rainfall distributions had been
true. A chi-square test showed that this split of percentages was not followed.
There are many more grid points where the assertion of difference was made than
expected. Thus, overall the temporal rainfall distributions at noon and at midnight
are different.

With the Wilcoxon signed-rank test it was found that, at a one out of 10 chance of
error, the numbers of grid points at which the noon rainfall is less than, equal
to, and more than the midnight rainfall are, respectively, 430, 4,890, and 1,065.
In terms of percentage, they are 6.7, 76.6, and 16.7%, respectively. This is in
contrast to 10, 80, and 10%, respectively, if the assumption of no difference had
been true. A chi-square test concluded that the midnight rainfall is not equal
to the noon rainfall; the noon rainfall is convincingly higher than the midnight
rainfall.

Time series analysis was conducted on some selected grid points in both
Phases I and II. This analysis is designed to detect if there is a short term
cycle in the rainfall pattern during the experiment periods and to examine the
temporal correlational behavior of the rainfall.

For Phase I, 20 grid points were randomly selected from the 8,128 grid points at
which data are available, and for Phase II, 16 grid points were strategically
selected. For each of these grid points, we obtained a time series consisting of the
hourly averages of rainfall. There are 36 time series, 20 for Phase I and 16 for
Phase II.
Although there was no uniform shape of the autocorrelation functions of
these time series, 9 out of 20 in Phase I and 9 out of 16 in Phase II are basically
negative exponential curves. The values of the auto-correlation decrease rapidly
as the lag increases. The medians of the auto-correlations for these
series at lags 1, 2, 3 and 4 are 0.46, 0.17, 0.15, and 0.10 for Phase I and 0.41,
0.12, 0.08, and 0.04 for Phase II, respectively. The medians at lags 5 or greater
are not significantly different from 0 for series in both Phases.

Short term cycles were detected in some of the time series. These cycles
are sparse and irregular. For Phase I, a 10-hour cycle is found in one series, a
15-hour cycle in another series and a 25-hour cycle in another one. A 12.5-hour cycle
is found in two series. For Phase II, the cycles found are usually longer: a 13-hour
cycle in two series, a 50-hour cycle in 3 series and a 100-hour cycle in one series.
Thus, short term cycles exist in the oceanic rainfall, but they are not prevalent.

It is interesting to note that the rainfall distribution in Phase II is very
even among all locations; it varies little from location to location. In
Phase I, the story is different. The value of the mean ranges from 0.0014 cm/hr
to 0.1320 cm/hr in Phase I; the latter is 94 times the former. The variation
in the variance is also dramatic in Phase I: the largest variance is 9,349 times
the smallest. For Phase II, it is only 25 times. This phenomenon is probably
associated with the heavy thunderstorm activity that usually happens in June or July,
during which Phase I of GATE was conducted.
Chapter 1: Objectives and Background


1.1 Introduction 

Many members of the meteorological community have long been concerned with the
possibility of diurnal rainfall variation over the oceans. A comprehensive study
of the observational evidence for the acceptance of the hypothesis that such a
variation exists was completed by Jacobson (1976). Gray assumes that this is the
case and proceeds to attempt an explanation based on a radiational cooling profile
theory. Much of the observational data used by Jacobson was based on small island
data in the Pacific.

It should be noted that the work of Jacobson points to the existence of a
maximum in the morning and a minimum at night; this is in contrast to a recent
paper of Weickman, Long and Hoxit (1977), where a maximum appears at night and a
minimum appears in the morning. The Weickman (et al) paper is based on GATE ship
measurements over the mid-Atlantic ocean. Assuming that Gray's theory explains
the variation over the Pacific, the Weickman (et al) study makes it clear that we
cannot expect the same theory to apply to the Atlantic.

Questions immediately arise, both of a theoretical and a methodological nature.
On the theoretical level, one wants to know if the variation is seasonally
dependent and how; is it latitudinally or longitudinally dependent and how; are there any
significant fluctuations in the variational distribution from year to year in a
given season; how does this affect our current overall estimates of world rainfall
rate; how does this affect current sampling plans to estimate the oceanic rainfall
budget?

On the methodological level, there have been doubts about whether any of the
reports on rainfall variations over the oceans can be believed. This is due
to many factors: the lack of quality control over data collected from both small
islands and ships; the lack of any reasonable distribution of reliable data
collection sources over a given region; the use of unreliable and/or untested data
manipulation methods and statistical procedures.

The purpose of this report is to present a comprehensive unbiased analysis
of the question of the existence or non-existence of a diurnal rainfall variation
over the Eastern Atlantic. Our study is based on the use of validated radar data
from a dense network of ships in the Eastern Atlantic during Phases I and II of
GATE. In order to clearly define the scope of our study, the specific research
objectives were:

I. To determine if diurnal rainfall variations exist and determine when
maxima and minima occur.

II. To identify and analyze periodic behavior in oceanic rainfall.

III. To develop a map showing those areas where rainfall variations exist.

Because of criticisms of previous efforts, we attempted to constrain our
approach methodologically in such a manner as to minimize technical objections
to our conclusions. This led us to the following specific methodological
objectives.

I. To determine if there was any mathematical difference in the empirical
distributions of rainfall at morning versus evening.

II. To determine if there was any statistical difference between the mean
hourly rainfall rates in the mornings versus evenings.

III. To determine if there was any temporal correlation between rainfall in
morning and evening.

1.2 Background 

The GARP Atlantic Tropical Experiment (GATE) was conducted in 1974 using a
total of twelve ships arranged in two hexagonal arrays which were exact for the
first two phases and distorted during the last phase. The extent and nature of the
distortion are clear from Figure 1. The ships in the inner hexagon were 165
km apart while those in the outer hexagon were 665 km apart. This report is
based only on data from Phases I and II.

The ships in the array made standard surface synoptic observations on an
hourly basis and upper air measurements every six hours; such measurements were
increased in frequency during periods of precipitation. Many ships made
oceanographic and radiation measurements, collected boundary layer profile data, and
recorded surface meteorological data; however, the data used in this study come
from two C-band radars used to obtain spatial distributions of rainfall
not available from other sources.

The C-band radars were chosen to minimize attenuation problems and maximize
spatial resolution. The radars were equipped with automatic digital processing
and recording equipment and the antennas were stabilized to compensate for possible
roll and pitch.

Data were collected at 15 min. intervals on a 24 hour basis. Antenna tilt
sequences of 360° scans at a series of 12 increasing tilt angles were collected
out to a range of 250 km. The reader is referred to Hudlow (1975) for details
and further discussion on the radar systems along with further references.

1.3 Approach

The distribution of rainfall has been studied extensively from many different
vantage points and a number of probability models have been developed. Thom (1968),
Simpson (1972), Johnson and Mielke, and Mielke (1973) are among the major workers
in this field. There is no single distribution which best describes rainfall and
any proposed model may be criticized at a number of levels. In particular, the
very nature of the diurnal rainfall variation makes the use of any model problematic.
The approach taken in this study is based on the use of three powerful methods
in mathematical statistics. The first two methods are used to accomplish the first
research objective and part of the others. We use two non-parametric statistics,
the Kolmogorov-Smirnov and the Wilcoxon signed rank. Non-parametric statistics
are, by definition, those statistics that are independent of any particular model
for the distribution which generates the data, and thus all conclusions drawn
are model independent.

This approach only depends on the assumption that the underlying distribution
is of continuous type; that is, that measurements take values in a continuous
interval or union of intervals and do not take any specific value with positive
probability. It is clear that rainfall measurement is not of continuous type,
and in fact takes the value zero with large probability. For our study, this is
not a major problem since we are only interested in comparisons conditioned on
the event of rainfall; thus our approach uses the method of conditional
probability. The distribution of measurements conditioned on the event of nonzero
rainfall is a continuous distribution. For related applications, see McAllister
(1969) and Gringarten (1970).

The third method used in this report is based on the assumption that the
hourly rainfall averages may be viewed as a time series. This approach allows
us to examine the serial correlation of the hourly rainfall, to detect short
term cycles, to measure and compare rainfall at different locations and to
establish confidence intervals for the mean value of hourly rainfall at different
locations. We are thus able to accomplish our last two research objectives within
the imposed methodological constraints.

In the second chapter we discuss the Kolmogorov-Smirnov and Wilcoxon signed
rank statistics and analyze their application to the problem of rainfall variation
between noon and midnight. In the third chapter, we apply time series methods to
the hourly rainfall data and provide a detailed analysis of results. In Chapter 4,
we report results of the diurnal analyses.



Chapter 2: Statistical Methodologies


In this section we discuss the statistical foundations which underpin our
approach. Since the Wilcoxon test is well documented in the literature and well
known to applied researchers, we devote a major portion of this section to the
explication of the Kolmogorov-Smirnov statistic, which is less used and less
known in applied research circles.

2.1 Kolmogorov-Smirnov.

We would like to compare the distribution of rainfall and rainfall rate at
noon with that occurring at midnight. We expect that if there is a diurnal
variation then these two distributions may be different. It is possible that there
could be a diurnal variation and yet the distributions are the same, so that this
assumption is biased in favor of no variation.

Let $(x_1, y_1), \ldots, (x_n, y_n)$ represent n observations of rainfall over some
region G. The x's represent the measurement at midnight and the y's at noon. In
order to satisfy our assumption of continuity, we only consider those pairs
(x, y) for which either x > 0 or y > 0. Let $F^1(x)$ be the cumulative distribution
function for midnight and $F^2(y)$ be the cumulative distribution function for noon. These
are the unknown true conditional distributions. We can now state that if there is
no variation, then we expect that $F^1 = F^2$; that is, it is desired to test
the statistical hypothesis:

$$H:\ F^1(x) = F^2(x) \tag{2.1}$$

versus the alternate hypothesis:

$$A:\ F^1(x) \neq F^2(x) \tag{2.2}$$

Under this framework, the basic hypothesis (H) is that the distribution of the midnight
rainfall measure is the same as the distribution of the noon rainfall measure. Based
on the data, we would like to confirm or reject this hypothesis. In rejecting H
we would be making the conclusion that the alternate hypothesis (A) is true.
Since we don't have direct access to $F^1(x)$ and $F^2(y)$, we construct and analyze
the empirical distributions $F_n^1(x)$ and $F_n^2(y)$, where n denotes the number of sample
points. Define $\epsilon(\lambda)$ by:

$$\epsilon(\lambda) = \begin{cases} 1 & \text{if } \lambda \ge 0 \\ 0 & \text{if } \lambda < 0 \end{cases}$$

and set:

$$F_n^1(x) = \frac{1}{n} \sum_{j=1}^{n} \epsilon(x - x_j) \tag{2.3}$$

$$F_n^2(y) = \frac{1}{n} \sum_{j=1}^{n} \epsilon(y - y_j) \tag{2.4}$$

Since the superscript identifies the distribution, we shall let $\lambda$ denote the
independent variable so that equations (2.3) and (2.4) now become:

$$F_n^i(\lambda) = \frac{1}{n} \sum_{j=1}^{n} \epsilon(\lambda - X_j^i), \quad i = 1, 2 \tag{2.5}$$

where $X_j^i = x_j$ if i = 1 and $X_j^i = y_j$ if i = 2. The following theorems show how the
empirical distributions are related to the true distributions. We interpret
$F_n^i(\lambda)$ to be the proportion of the $X_j^i$ which do not exceed $\lambda$.

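Theorem 2.1. The expectation and covariance of $F_n^i(\lambda)$ satisfy:

1) $E\{F_n^i(\lambda)\} = F^i(\lambda)$ (2.6)

2) $\mathrm{Cov}\{F_n^i(\lambda), F_n^i(\mu)\} = \frac{1}{n}\{\min[F^i(\lambda), F^i(\mu)] - F^i(\lambda) F^i(\mu)\}$ (2.7)


Theorem 2.2.

1) (Strong law of large numbers)

$F_n^i(\lambda) \to F^i(\lambda)$ (with probability 1)

2) $\limsup_{n \to \infty} \dfrac{\sqrt{n}\,|F_n^i(\lambda) - F^i(\lambda)|}{\sqrt{2 \log \log n}} = \sqrt{F^i(\lambda)[1 - F^i(\lambda)]}$ (with probability 1)

3) $\sup_{-\infty < \lambda < \infty} |F_n^i(\lambda) - F^i(\lambda)| \to 0$ (with probability 1) (Cantelli-Glivenko lemma)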

Proofs of the above results require very delicate analysis; the results are presented
in order to provide us with the relationship between true and empirical distributions.
The relevant literature includes: Gihman 1953, 1954, Glivenko 1933, Gnedenko and
Kolmogorov 1933, 1941, and Smirnov 1936, 1937, 1939.

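The next two theorems are fundamental to our testing procedure. Define $E_n^i$ and
$D_n^{12}$ by (i = 1, 2):

$$E_n^i = \sup_{-\infty < \lambda < \infty} [F_n^i(\lambda) - F^i(\lambda)] \tag{2.8}$$

$$D_n^{12} = \sup_{-\infty < \lambda < \infty} [F_n^1(\lambda) - F_n^2(\lambda)] \tag{2.9}$$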

Theorem 2.3

1) $D_n^{12}$ is independent of $F^i(\lambda)$, i = 1, 2

2) $\lim_{n \to \infty} \Pr[\sqrt{n/2}\, D_n^{12} < y] = 1 - e^{-2y^2} = \Phi(y)$ for $0 \le y < \infty$


Proof: We shall obtain the proof of Theorem 2.3 as a special case of Theorem 2.4.


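Theorem 2.4. Set $c = [[\sqrt{2n}\, y]]$ (greatest integer); then:

$$\Phi_n(y) = \Pr\left[\sqrt{n/2}\, D_n^{12} < y\right] = \begin{cases} 1 - \dbinom{2n}{n-c} \Big/ \dbinom{2n}{n} & \text{if } 0 \le y \le \sqrt{n/2} \\ 1 & \text{if } y > \sqrt{n/2} \end{cases} \tag{2.10}$$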
Proof: Let us arrange our 2n observations $(x_1, y_1), \ldots, (x_n, y_n)$ in increasing
order of magnitude $X_1, \ldots, X_{2n}$. Consider the new set of random variables
$Y_1, \ldots, Y_{2n}$ defined by:

$$Y_k = \begin{cases} 1 & \text{if } X_k \text{ is a midnight observation} \\ -1 & \text{if } X_k \text{ is a noon observation} \end{cases} \tag{2.11}$$

Set $S_0 = 0$ and $S_k = \sum_{j=1}^{k} Y_j$. It is easy to see that
$n D_n^{12} = \sup_{0 \le k \le 2n} S_k$. Let us note
that $S_{2n} = 0$; if we plot the points $(k, S_k)$ for k = 0, 1, ..., 2n in the (u,v) plane
and connect these points by straight-line segments, the v component will increase
by one unit at n points (corresponding to rainfall observations at midnight) and will
decrease by one unit at the remaining n points.



Figure 2 represents a typical trajectory of the process; the dotted lines
indicate that the curve has been broken. Since there are n increases and n
decreases, the total number of possible trajectories is $\binom{2n}{n}$; furthermore, as our
null hypothesis is that both distributions are the same, this means that all
trajectories are equally likely. (Another bias in favor of the null hypothesis.)
Hence, the probability of each trajectory is $1/\binom{2n}{n}$. In our geometric
interpretation, the required probability $\Phi_n(y) = \Pr[\sqrt{n/2}\, D_n^{12} < y]$ can be stated
as the probability that the entire trajectory will be below the line v = c. Now
the total number of trajectories which do not reach the line v = c is the same as
$\binom{2n}{n}$ - (the number of trajectories which reach the line v = c). Reflecting the part
of a trajectory after its first visit to the line v = c about that line gives a
trajectory ending at height 2c, which has n + c increases and n - c decreases; hence
there are $\binom{2n}{n-c}$ possible distinct trajectories
which reach the line v = c. Thus the number of trajectories which do not reach the
line v = c is $\binom{2n}{n} - \binom{2n}{n-c}$, so that:

$$\Phi_n(y) = \frac{\binom{2n}{n} - \binom{2n}{n-c}}{\binom{2n}{n}} = 1 - \frac{\binom{2n}{n-c}}{\binom{2n}{n}}$$
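The counting argument above can be checked numerically for small n. The sketch below is only an illustration: the function names `phi_exact` and `phi_brute` are hypothetical, and the brute-force enumeration is a verification device for this note, not part of the original procedure. It compares the closed form with a direct enumeration of all equally likely trajectories.

```python
from itertools import combinations
from math import comb


def phi_exact(n, c):
    """Pr[max_k S_k < c] for a path with n up-steps and n down-steps (closed form)."""
    if c <= 0:
        return 0.0   # the path starts at height 0, so its maximum is never below 0
    if c > n:
        return 1.0   # the path can never climb higher than n
    return 1.0 - comb(2 * n, n - c) / comb(2 * n, n)


def phi_brute(n, c):
    """Same probability, obtained by enumerating all C(2n, n) trajectories."""
    below = 0
    total = 0
    for ups in combinations(range(2 * n), n):
        up_positions = set(ups)
        height = 0
        peak = 0
        for k in range(2 * n):
            height += 1 if k in up_positions else -1
            peak = max(peak, height)
        total += 1
        if peak < c:
            below += 1
    return below / total


for n in (3, 4):
    for c in range(0, n + 2):
        print(n, c, phi_exact(n, c), phi_brute(n, c))
```

The enumeration grows as C(2n, n), so it is only practical for very small n; the closed form is what would be used in practice.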


We may now provide a proof of Theorem 2.3.

Proof: Let us prove part 2) first; set

$$J = \binom{2n}{n-c} \Big/ \binom{2n}{n} \tag{2.12}$$

If we use Stirling's formula:

$$k! = \sqrt{2\pi}\, k^{k+1/2} e^{-k} (1 + \theta(k)) \tag{2.13}$$

where $\theta(k) \to 0$ as $k \to \infty$, and note that:

$$J = \frac{(n!)^2}{(n + c)!\,(n - c)!}$$

we have:

$$J = \frac{[n^{n+1/2} e^{-n}]^2 (1 + \theta(n))}{(n + c)^{n+c+1/2} e^{-(n+c)} (n - c)^{n-c+1/2} e^{-(n-c)}}$$

$$= \frac{n^{2n+1} e^{-2n} (1 + \theta(n))}{n^{n+c+1/2} (1 + c/n)^{n+c+1/2} e^{-(n+c)}\, n^{n-c+1/2} (1 - c/n)^{n-c+1/2} e^{-(n-c)}}$$

$$= (1 + c/n)^{-(n+c+1/2)} (1 - c/n)^{-(n-c+1/2)} (1 + \theta(n)) \tag{2.14}$$
Using the Maclaurin expansion for log J, we get to the first order:

$$\log J = -\frac{c^2}{n} + \theta(n) \tag{2.15}$$

Recall that $c = [[\sqrt{2n}\, y]]$ so that $c^2/n \approx 2y^2$; equation (2.15) becomes
$\log J = -2y^2 + \theta(n)$, so that:

$$J = \exp\{-2y^2\}\,(1 + \theta(n)) \tag{2.16}$$

If we now use equation (2.10) we get

$$\Phi_n(y) = 1 - \exp\{-2y^2\}\,(1 + \theta(n)), \qquad \Phi(y) = \lim_{n \to \infty} \Phi_n(y) = 1 - \exp\{-2y^2\}$$

Let us now prove part 1). We will show that $D_n^{12}$ is independent of $F^i(\lambda)$ by
showing that $E_n^i$ is independent of $F^i(\lambda)$; it suffices to show this for $E_n^1$ since the
same method applies to $E_n^2$.

Proof:

Define $v(t)$ as the inverse function of the event $F^1(s) \le t$; this means that
$\{s \le v(t)\}$ is the inverse event, and this event has probability $F^1(v(t)) = t$.
Hence if we set $t = F^1(s)$, then $E_n^1$ remains unchanged if $F^1(s)$ is replaced by the
uniform distribution. Our proof is complete if we recall that given any probability
distribution, there exists a transformation which transforms it to the uniform
distribution. Hence any statistic which is invariant under the uniform distribution
is invariant under all distributions.

Returning to our test procedures, two types of possible error may be committed
in confirming or rejecting the basic hypothesis: type I and type II. Type I
error occurs if the basic hypothesis H is rejected while it is true and type II
error occurs if H is confirmed while it is not true. Type II error is usually
used to determine the testing approach. Since our approach has been determined by
other constraints, we shall only concern ourselves with type I error.
Let $\alpha$ be the probability of type I error, that is:

$\alpha$ = Pr [H is true and H is rejected]

This number $\alpha$ is referred to as the significance level of the test. In this study,
we choose $\alpha$ to be .10. This implies that there is a 1/10 chance of rejecting H while
H is actually true. This level was determined by the number of data points available.

In order to implement our procedure, we must determine a constant $\gamma$ such that,
assuming H is true:

$$\Pr[D_n^{12} \ge \gamma] = .10$$

$$\Pr[D_n^{12} < \gamma] = .90$$

This gives us a 90% chance of accepting H when H is true. The results of this test
are reported in Chapter 4.
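A minimal sketch of how such a constant could be obtained is given below, under two stated assumptions: the limiting form 1 - exp(-2y^2) of Theorem 2.3 is used for the asymptotic value, and the lattice-path formula of Theorem 2.4 is used for the exact integer cutoff. The function names and the sample size n = 20 are hypothetical; the report does not state the number of paired observations per grid point.

```python
from math import comb, log, sqrt


def asymptotic_gamma(alpha, n):
    """Threshold on D_n from the limiting law: 1 - exp(-2 y^2) = 1 - alpha."""
    y = sqrt(-log(alpha) / 2.0)      # solves exp(-2 y^2) = alpha
    return y * sqrt(2.0 / n)         # undo the sqrt(n/2) normalisation


def exact_c(alpha, n):
    """Smallest integer c with Pr[max_k S_k >= c] <= alpha under H."""
    total = comb(2 * n, n)
    for c in range(1, n + 1):
        if comb(2 * n, n - c) / total <= alpha:
            return c
    return n + 1                     # no level in 1..n is rare enough


n = 20                               # hypothetical number of paired observations
print(asymptotic_gamma(0.10, n))     # cutoff for D_n itself
print(exact_c(0.10, n) / n)          # exact cutoff, expressed on the same scale
```

For small n the exact cutoff is preferable, since the limiting law is only an approximation and the statistic takes values on a lattice of multiples of 1/n.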

2.2 Wilcoxon Signed Rank Test. 

Instead of the alternative hypothesis A in the Kolmogorov-Smirnov test, we
consider two alternative hypotheses $A_1$ and $A_2$ defined as

$A_1$: The midnight rainfall is (stochastically) greater than the noon rainfall,

$A_2$: The midnight rainfall is (stochastically) less than the noon rainfall.

Using the notations in subsection 2.1, $A_1$ and $A_2$ can be denoted symbolically as

$A_1$: $F^1 > F^2$ and $A_2$: $F^1 < F^2$.

Thus, the two testing problems considered in the Wilcoxon signed rank test are

H: $F^1 = F^2$ vs $A_1$: $F^1 > F^2$

and

H: $F^1 = F^2$ vs $A_2$: $F^1 < F^2$.
However, these two problems are statistically equivalent, respectively,
to $H_1$: $F^1 \le F^2$ vs $A_1$: $F^1 > F^2$ and $H_2$: $F^1 \ge F^2$ vs $A_2$: $F^1 < F^2$.
If $H_1$ is not rejected in the first problem and $H_2$ is not rejected in the second
problem, then we conclude that H: $F^1 = F^2$ holds.

As in the K-S test, there are two types of possible error in each of these
two problems. Again, we consider only type I errors. To test H vs $A_1$, type I
error is committed if H (or $H_1$) is rejected while H: $F^1 = F^2$ is actually true, i.e.
confirming $A_1$: $F^1 > F^2$ while H: $F^1 = F^2$ is actually true. The significance level
is given by

$\alpha_1 = P_H[\text{rejecting } H] = P_H[\text{accepting } A_1]$.

To test H vs $A_2$, type I error is committed if $A_2$: $F^1 < F^2$ is accepted while
H: $F^1 = F^2$ is actually true. The significance level is given by

$\alpha_2 = P_H[\text{rejecting } H] = P_H[\text{accepting } A_2]$.

Note that rejecting H implies accepting $A_1$ in the first testing problem H vs $A_1$
and accepting $A_2$ in the second testing problem H vs $A_2$. In this study, we adopted
$\alpha_1 = \alpha_2 = .10$.

Let $(x_1, y_1), \ldots, (x_n, y_n)$ be as defined in subsection 2.1. The testing
procedure of the Wilcoxon signed rank test consists of deriving a statistic $V_n$ based on
$(x_1, y_1), \ldots, (x_n, y_n)$ and choosing constants $C_1$ and $C_2$ corresponding to
$\alpha_1$ and $\alpha_2$, respectively, such that for testing H vs $A_1$

rejecting H (or accepting $A_1$) if and only if $V_n \ge C_1$

and for testing H vs $A_2$

rejecting H (or accepting $A_2$) if and only if $V_n \le C_2$.

For more detail, the reader is referred to Lehmann (1975).
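The text does not spell out the statistic, so the sketch below assumes the standard textbook form of the signed rank statistic: the sum of the ranks of the absolute differences belonging to positive differences (midnight minus noon), with average ranks for ties and zero differences dropped. The cutoffs use a normal approximation, which is also an assumption here; the function names and decision labels are hypothetical.

```python
from statistics import NormalDist


def signed_rank(x, y):
    """Return (V, m): sum of ranks of positive differences, and the number of nonzero pairs."""
    d = [a - b for a, b in zip(x, y) if a != b]      # drop zero differences
    order = sorted(range(len(d)), key=lambda j: abs(d[j]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied absolute differences
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1               # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    v = sum(r for r, diff in zip(ranks, d) if diff > 0)
    return v, len(d)


def decide(x, y, alpha=0.10):
    """Three-way decision with one-sided level alpha on each side (normal approximation)."""
    v, m = signed_rank(x, y)
    if m == 0:
        return "no difference detected"
    mean = m * (m + 1) / 4
    sd = (m * (m + 1) * (2 * m + 1) / 24) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha)
    c1, c2 = mean + z * sd, mean - z * sd            # upper and lower cutoffs
    if v >= c1:
        return "midnight > noon"
    if v <= c2:
        return "midnight < noon"
    return "no difference detected"


# Hypothetical paired rainfall rates (midnight, noon), for illustration only.
print(decide([0.30, 0.12, 0.40, 0.25, 0.50], [0.10, 0.15, 0.10, 0.05, 0.20]))
```

For the small numbers of rainy noon/midnight pairs available at a single grid point, exact tables of the signed rank distribution would be more appropriate than the normal approximation used in this sketch.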

Note from the above discussion that when H is true, $P_H[V_n \le C_2] = .10$,
$P_H[C_2 < V_n < C_1] = .80$ and $P_H[V_n \ge C_1] = .10$. Thus, when H is true, the chances
of concluding that midnight rainfall is (stochastically) greater than, equal to
and less than the noon rainfall are, respectively, 10%, 80% and 10%.

The result of this test is reported in Chapter 4.


2.3. Chi-square Test.

The tests described in subsections 2.1 and 2.2 are applied to the midnight and
noon rainfall measures at each grid of the GATE data. It was seen that when H is
true, the chances of drawing the conclusions that $F^1 = F^2$ and $F^1 \neq F^2$ are, respectively,
90% and 10%, and that $F^1 < F^2$, $F^1 = F^2$ and $F^1 > F^2$ are, respectively, 10%, 80% and 10%. We
use the $\chi^2$-test to test whether these percentages are actually obtained in the result.

Let $p_j$ be the probability of making the jth conclusion, j = 1, 2, ..., k, where
k is the number of different conclusions (k = 2 in the K-S test and k = 3 in the Wilcoxon
test). Let $O_j$ be the number of grids at which the jth conclusion is made, j = 1, 2, ..., k,
and $n = O_1 + O_2 + \ldots + O_k$ the number of grids in the study. Define

$$\chi_k^2 = \sum_{j=1}^{k} \frac{(O_j - n p_j)^2}{n p_j}$$

If the given percentages of different conclusions are followed, $\chi_k^2$ should be
relatively small. Thus, if $\chi_k^2$ is large, we may conclude that the percentages do
not hold true and should have been otherwise. For instance, in the Wilcoxon test,
$p_1 = .10$, $p_2 = .80$ and $p_3 = .10$. If $\chi_3^2$ is too large, then the percentages of 10,
80 and 10% do not hold and thus the hypothesis H: $F^1 = F^2$ is not true. To determine
"how large" is "too large", a cutoff constant C can be found corresponding to k and
the desired significance level of the $\chi^2$-test. With $\alpha = .05$ (i.e. 5% chance of
making a wrong conclusion), C = 3.84 for k = 2 and C = 5.99 for k = 3.

Results of the $\chi^2$-test are reported in Chapter 4.
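As a worked illustration of the statistic defined in this subsection, the following sketch applies it to the grid counts quoted in the Summary (5,425 and 960 for the K-S test; 430, 4,890 and 1,065 for the Wilcoxon test) and compares the result with the cutoffs 3.84 and 5.99 given above. The function name `chi_square` is hypothetical.

```python
def chi_square(observed, probs):
    """Sum over j of (O_j - n p_j)^2 / (n p_j), with n the total count."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))


# Counts of grid points as quoted in the Summary of this report.
ks_stat = chi_square([5425, 960], [0.90, 0.10])                  # same, different
wilcoxon_stat = chi_square([430, 4890, 1065], [0.10, 0.80, 0.10])  # less, equal, more

print(ks_stat, ks_stat > 3.84)               # cutoff for k = 2
print(wilcoxon_stat, wilcoxon_stat > 5.99)   # cutoff for k = 3
```

Both statistics exceed their cutoffs by a wide margin, which agrees with the conclusion stated in the Summary that the nominal percentages were not followed.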



Chapter 3: Temporal Analysis on Hourly Rainfall Averages

The hourly rainfall averages were considered as observations in a time series.
Analyses on the time series were undertaken for both Phase I and II data in the
experiment. Properties of such time series at selected locations in the GATE
experiment were investigated. In the following sections, the objective of the study,
the data and sampling design as well as the analyses and their results are reported.

3.1. Objective of the Study
The objectives of this study are:

• To examine the serial correlation of the hourly rainfall.

• To detect the short term cycles in the rainfall activities.

• To measure and compare the power of rainfall activities at different localities.

• To establish a confidence interval for the mean of hourly rainfall averages.

The serial correlation of hourly rainfall averages reveals the linear dependence
of the rainfall on the rainfall of the preceding hours. These statistics are
closely related to the duration of rainfall activities.

High autocorrelation at large lags implies long duration and vice versa.

The serial correlation is also related to an assumption in the studies in Chapter 2,
the diurnal analysis of rainfall. It was implicitly assumed in that study that the
noontime and midnight rainfall rates are independent. Here we consider the correlation
between rainfalls several hours apart. While absence of correlation between rainfalls in
different hours does not imply independence between them, the two may be considered close
enough for practical purposes.
It has been suspected that a two-day cycle exists in oceanic rainfall. One
objective of the study is either to confirm or disconfirm this conjecture.

The third objective is to establish a quantity to measure the "amplitude" of rainfall
activities. Power is a term in time series analysis designated to measure the amount
of variation in the series. Since for most of the time there is no rain, a large
value of this quantity may be a good indication of frequent rainfall or even
thunderstorm activity.

3.2. Sampling Design and Data 
3.2.1. Sampling Design 

In order to have unbiased results as well as to account for the variation of
rainfall in different locations, both a randomly selected sample and a strategically
selected sample were used in the study.

The region of the experiment was put into a coordinate system as shown in
Figure 3.2.1. Each grid is assigned a pair of integers between 0 and 100 as longitude
and latitude coordinates. The grid represents a square of 4 km x 4 km, which is
the spatial resolution for the experiment.

There are 10,000 (100 x 100) grids in the coordinate system. Rainfall measures
were consistently observed in 8,128 of these grids, scattered around the center of
the big square. For this study, a sample of grids is taken for each phase from those
grids in which the rainfall measure is available.

For the Phase I data a randomly selected sample of 20 grids was used. For
each grid in the sample, we have a time series consisting of hourly rainfall averages.
Thus, from Phase I data of the experiment, we obtain 20 time series, each representing
the hourly rainfall averages in one of the selected grids. These 20 selected grids
are shown in Figure 3.2.1.


[Figure 3.2.1. Grids Selected for Study in Temporal Analysis. Vertical axis: latitude coordinate. Symbols: • Phase I, x Phase II.]
For Phase II data, a strategically selected sample of 16 grids was used. The
sampled grids are also shown in Figure 3.2.1.

3.2.2. Data and Missing Values. 

Hourly rainfall averages are computed by adding the rainfall measured in the
hour and then dividing by the number of rainfall measurements within the hour. This
is done for each hour in the duration of Phases I and II in the GATE experiment.

The nature of the study calls for an uninterrupted sequence of hourly rainfall
averages. It was found that there are hours in both Phases I and II which do not
have rainfall measurements. This situation usually occurred in the earlier stage of
both phases. We decided to cut short the time series to accommodate our needs. For
Phase I, the actual duration used for this study is from the 12th hour of the 182nd day
to the 17th hour of the 197th day. (The Phase I experiment lasts from the 0th hour
of the 179th day to the 17th hour of the 197th day.) (All Julian hours and days.)
For Phase II the actual duration used is from the 20th hour of the 214th day to the
21st hour of the 227th day. (The Phase II experiment lasts from the 0th hour of the
209th day to the 21st hour of the 227th day.)

In both Phases I and II there still exists an hour without measurement. This is
filled by interpolation. It is expected that the effect of this is
negligible.

We could have used the original rainfall measure of every quarter hour as the
observation (term) of the time series, instead of the hourly rainfall average. However,
there are so many missing values in the quarter-hourly measures throughout the
experiment in both Phases I and II that it is not advisable to use them directly.
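The averaging and gap-filling steps described above might be sketched as follows. The text does not say which interpolation method was used, so linear interpolation between the two neighbouring hours is assumed here; the function names and readings are hypothetical.

```python
def hourly_average(readings):
    """Mean of the available readings within one hour; None if none were recorded."""
    values = [r for r in readings if r is not None]
    return sum(values) / len(values) if values else None


def fill_single_gaps(series):
    """Fill an isolated missing hour with the mean of its two neighbours (assumed method)."""
    filled = list(series)
    for t in range(1, len(filled) - 1):
        if filled[t] is None and filled[t - 1] is not None and filled[t + 1] is not None:
            filled[t] = (filled[t - 1] + filled[t + 1]) / 2
    return filled


# Hypothetical quarter-hourly readings (cm/hr) for three consecutive hours.
hours = [[0.2, None, 0.4, 0.0], [None, None, None, None], [0.1, 0.1, 0.3, 0.3]]
series = [hourly_average(h) for h in hours]
print(fill_single_gaps(series))
```

Runs of two or more consecutive missing hours are deliberately left unfilled in this sketch, which mirrors the decision in the text to shorten the series rather than interpolate across long gaps.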



3.3. Auto Correlation


It is well known that the hourly rainfall averages of adjacent hours are
positively correlated. That is, given that there was an above average rainfall in
a specific hour, it is very likely that there would be an above average rainfall
in the next hour, and possibly in the hour thereafter, etc. Auto-correlation
measures the correlation of the hourly rainfall of hours apart. The computation
formulas are given in subsection 3.3.1 and results are reported in the subsequent sections.


3.3.1. Computation Formulas. 

Let $x_1, x_2, \ldots, x_n$ denote the hourly rainfall averages at a given grid of the study. 
Note that n is the number of hourly rainfall averages in the grid. Define the 
sample mean and sample variance as 

\[ \bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t , \]

\[ S^2 = \frac{1}{n} \sum_{t=1}^{n} (x_t - \bar{x})^2 . \]

Then the r-th auto-correlation of the series is defined to be 

\[ R(r) = \frac{1}{n} \sum_{t=1}^{n-r} (x_t - \bar{x})(x_{t+r} - \bar{x}) / S^2 , \qquad r = 1, 2, \ldots, k. \]

The number r in the formula is called the lag of the auto-correlation, and k is the 
maximum number of lags for which the auto-correlation is to be computed. The 
auto-correlation R(r), considered as a function of r, is called an auto-correlation function. 
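
The definitions above translate directly into a few lines of code. The sketch below 
assumes the 1/n normalisation for both the variance and the lagged cross-products; the 
function name autocorrelation and the sample values are illustrative only and are not 
taken from the GATE data. 

```python
def autocorrelation(x, max_lag):
    """Return [R(1), ..., R(max_lag)] using the 1/n normalisation (assumed)."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    if variance == 0:
        raise ValueError("series is constant; auto-correlation is undefined")
    result = []
    for r in range(1, max_lag + 1):
        # Sum of lagged cross-products over t = 1 .. n - r, divided by n.
        cov = sum((x[t] - mean) * (x[t + r] - mean) for t in range(n - r)) / n
        result.append(cov / variance)
    return result


# Made-up series: mostly dry hours with one short wet spell.
rain = [0.0, 0.0, 0.1, 0.4, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
for lag, value in enumerate(autocorrelation(rain, 4), start=1):
    print(f"R({lag}) = {value:+.3f}")
```

Plotting the returned values against the lag gives the correlogram of the series. 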

It should be pointed out that, unlike in a random sample (from a population of 
rainfalls), $x_1, x_2, \ldots, x_n$ are not independent and identically distributed. Here 
$x_t$ denotes the t-th hourly rainfall average starting at some specific hour. The length 
n of the time series is 366 for series in Phase I data and 328 for series in Phase II 
data. 

3.3.2. Correlograms. 

The correlogram of a time series is the plot of its auto-correlation function 
R(r) against the lag r. Just as the histogram is in studying sampling problems, the 
correlogram is descriptive and informative in studying time series. It is a 
visual device which is useful to perceive the linear relationship of rainfalls 
several hours apart and to identify the mechanism generating the rainfall series. 
However, the latter, i.e. identifying the model of the series, is not an objective of 
this study. 

In the following paragraphs, we make some observations on the correlograms for 
time series in both Phases I and II. 

As said earlier, the length of the time series is 366 for Phase I and 328 
for Phase II. Corresponding to these sample sizes, the approximate 95% confidence 
interval for a correlation coefficient of 0 is (-.12, .12). That is, if it is 
hypothesized that the correlation coefficient is zero, the chance of having the 
computed coefficient fall outside the interval (-.12, .12) is 5%, or 1 out of 20. The 
5% chance of error is a generally accepted level in practice. Thus, if the 
auto-correlation of a time series at any given lag is in the interval (-.12, .12), we 
may safely assume the value 0 (with 5% chance of being wrong). 

Examining the correlograms, it is found that there is no 
uniform and specific shape for the correlograms of all time series. The shape 
varies greatly from series to series. The length of these time series may be blamed 
for the non-uniformity in the correlograms. In general, a time series of length 366 
or 328 should be long enough to obtain a meaningful result. But for the time series 
on hand, this may not be true, because there are too many zeros (more than 4 out of 
5 for most series) among the hourly rainfall averages. It is very likely that just a 
few hours of heavy or even moderate rainfall at a different time would 
change the shape of the correlogram dramatically. 

First consider the correlograms for rainfall series in Phase I. 

Even though there is no uniform shape for the correlograms, nine (9) of them 
share the same basic (negative) exponential shape. These are for time series at 
grids (5,43), (33,64), (38,7), (40,89), (44,23), (46,18), (49,97), (50,18), and 
(92,70). For most of these time series, the auto-correlations of lag 5 or higher are 
small and within the interval (-.12, .12), hence may be considered zero with 95% 
confidence. 

Time series at grids (64,90) and (74,91) have very similar lobed exponential 
shapes of correlogram. This may not come as a surprise because these two grids are 
only about 40 km apart. The correlogram for the time series at (21,47) has 
moderately high auto-correlations at lags 10, 30, and 42. This may be an indication of a 
10-hour cycle and will be examined in the next section (see 3.4.2). 

The time series at grid (40,89) has consistently large auto-correlations at small to 
moderate lag numbers. This is largely due to a long string of hours with persistent 
rainfall on the 183rd and 184th Julian days at the location. 

The magnitudes of the auto-correlations for time series at grids (35,55), (37,94), 
(38,7), (57,23), (65,62) and (67,9) do not seem to depend on the lag number; the 
magnitudes do not decrease as the lag number increases. In particular, those at (37,94), 
(38,7) and (65,62) are almost all negligible at the 95% confidence level. 

Next, consider the correlograms in Phase II. 

The correlogram forms essentially a negative exponential curve for the rainfall 
averages at grids (18,34), (50,2), (50,24), (50,66), (50,98), (66,18), (66,50), (66,82) 
and (82,34). Thus 9 out of 16 correlograms assume this shape. The auto-correlations 
of first order are positive numbers of moderate magnitude and decrease exponentially 
as the order increases. 

For some of the time series, auto-correlations are significantly non-zero for 
large lags. For the series at grid (2,50), R(38) is .25, and for the series at (18,66), 
R(45) is .18 while R(1) is only .24. At (14,18), the auto-correlations are raised for lags 
between 38 and 46 to as high as .24. 

At (98,50), the correlogram behaves like a sine wave with peaks at lags 15, 28 
and 40. The power spectral density will be examined closely to see if cycles exist. 

3.3.3. Median and Confidence Interval. 

The median and its confidence interval of the auto-correlations are obtained and 
reported in this section. Since the correlation does not seem to be symmetrically 
distributed and the number of correlations available for the study is limited (20 for 
Phase I and 16 for Phase II), we use a non-parametric approach instead of the more 
conventional one where a normal (bell-shaped) distribution is assumed. 

Let $y_1 \le y_2 \le \cdots \le y_n$ be the auto-correlations (of any given lag) ordered in 
increasing order. The 25th and 75th percentiles are the corresponding order statistics. 
In Phase I, n = 20 and the median is $m = \frac{1}{2}(y_{10} + y_{11})$. In Phase II, 
n = 16 and the median is $m = \frac{1}{2}(y_8 + y_9)$. 

It is well known (see e.g. [33], p. 181) that for $0 < i < j \le n$ 

\[ P[y_i < m < y_j] = \sum_{x=i}^{j-1} \binom{n}{x} \left(\tfrac{1}{2}\right)^{x} \left(\tfrac{1}{2}\right)^{n-x} = r . \]

Thus $(y_i, y_j)$ is a 100r% confidence interval for m, the median. By using a binomial 
table (e.g. [33]) it is found that for n = 20 (Phase I) 

\[ P[y_5 < m < y_{16}] = .9941 - .0059 = .9882 \approx .99 \]

and for n = 16 (Phase II) 

\[ P[y_4 < m < y_{13}] = .9788 \approx .98 . \]

Hence $(y_5, y_{16})$ is a 99% confidence interval for the median of the auto-correlation 
with any given lag in Phase I and $(y_4, y_{13})$ a 98% confidence interval in Phase II. 
The chances of error are 1% and 2% in Phases I and II, respectively, in saying that 
the median is in the interval. 
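
The coverage probabilities quoted above can be checked without a binomial table. The 
sketch below evaluates the binomial sum for a pair of order statistics; the function name 
median_interval_coverage is hypothetical, and the index pairs (5, 16) for n = 20 and 
(4, 13) for n = 16 are assumptions chosen here because they reproduce the quoted values 
of about .9882 and .9788. 

```python
from math import comb


def median_interval_coverage(n, i, j):
    """P[y_i < m < y_j] for order statistics y_1 <= ... <= y_n (1-based, i < j).

    Equals the probability that a Binomial(n, 1/2) count lies in i .. j-1.
    """
    return sum(comb(n, x) for x in range(i, j)) / 2 ** n


print(round(median_interval_coverage(20, 5, 16), 4))  # 0.9882
print(round(median_interval_coverage(16, 4, 13), 4))  # 0.9787
```

The second value differs from the quoted .9788 only in the last digit, which is what 
one would expect from subtracting two table entries that were each rounded to four places. 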

Table 3.3.1 and Table 3.3.2 summarize the means, medians, 25th and 75th 
percentiles, ranges, and lengths of confidence intervals of the auto-correlations for lags 1 
through 12 in Phases I and II data, respectively. The range is defined to be the 
maximum of the correlations minus the minimum of the same. Figure 3.3.1 and Figure 3.3.2 
show the median, 25th and 75th percentiles along with the confidence interval for the 
median in Phases I and II data, respectively. 

It is seen that in both Figure 3.3.1 and Figure 3.3.2, the median with lag 1 is 
moderate (in the .40's) and there is a considerable drop between lags 1 and 2. The medians 
with lags 2 and 3 are about the same magnitude. Those with lag 5 or higher are so 
small that they are negligible. In fact, the confidence intervals contain 0 for 
most of them. Thus, the hypothesis that the median is 0 would not have been rejected. 

3.3.4. Concluding Remarks. 

The auto-correlation function for each time series in both Phases I and II 
was studied. The length of the series is 366 in Phase I and 328 in Phase II. Since 
most (more than 80%) of the rainfall values are 0, the auto-correlation functions seem 
unstable in the sense that the shapes of the functions are quite different from series to 
series. However, 9 out of 20 in Phase I and 9 out of 16 in Phase II are essentially 
negative exponential curves. For these series, the auto-correlation with lag 1 
ranges from .46 to .82 in Phase I and from .34 to .68 in Phase II. The value decreases 
exponentially as the lag increases. Three of the correlation functions show some 
repeating peaks and valleys. These series may have short-term cycles and will be 
examined for such in the next section. At lag 1, the median of the auto-correlation 
is .46 for Phase I series and .41 for Phase II series. At lags 2 and 3, the values 
for the auto-correlation drop considerably for both Phases I and II. The values 
are not significant for lags larger than or equal to 5. Thus we may say with caution 
that rainfalls 5 or more hours apart are uncorrelated. This statement is not 
always true. There are occasions when the auto-correlation with larger lag is 
significantly non-zero. 

3.4. Power Spectrum 

The power spectrum is a useful tool to detect a cycle in the time series. If the 
power at a frequency is large compared to the powers of neighboring frequencies, 
then we may conclude that there is a cycle in the time series corresponding to the 
frequency. 

In this section, we use the power spectrum estimate of hourly rainfall averages to 
detect short-term cycles. We start with a brief review of the methodology in subsection 
3.4.1. The analysis and results are reported in subsection 3.4.2. 

3.4.1. Power spectrum of the time series. 

To facilitate the interpretation of the results of spectral analysis, an 
oversimplified version of time series representation and its analysis is presented here. 
This version is intended for readers who have no background in time series analysis 
and want to get hold of some conceptual meaning of the results to be reported in the 
following subsection. Readers with background in spectral analysis of time series 
may skip this subsection and proceed to subsection 3.4.2. 

We assume that a time series x(t) may be written as 

\[ x(t) = \sum_{j=-\infty}^{\infty} z_j e^{i \lambda_j t} \]

where $i = \sqrt{-1}$, $\lambda_0 = 0$, $\lambda_{-j} = -\lambda_j$ and $z_{-j} = \bar{z}_j$ 
(the complex conjugate of $z_j$); the z's are random variables (with 
$E[z_j] = 0$ for $j \ne 0$). The restrictions on the $\lambda_j$ and z's imply that x(t) is a 
real-valued random variable for each t, the time. The representation of x(t) says that 
the series can be decomposed into contributions of harmonics at frequencies $\lambda_j$, or that 
the series is the superimposed sum of harmonics at frequencies $\lambda_j$. 

Table 3.3.1 Mean, Median and Other Statistics of Auto-Correlations in Phase I 

Table 3.3.2 Mean, Median and Other Statistics of Auto-Correlations in Phase II 

Fig. 3.3.1 Median, its Confidence Interval, and 25th and 75th Percentiles, of Auto-Correlations in Phase I 

It can be shown then that the power of the time series can be written as 

\[ \text{power of } x(t) = \sum_{j} [\text{power of } x(t) \text{ at frequency } \lambda_j] . \]

The power of x(t) at frequency $\lambda_j$ measures the amount of variation, or intensity, of 
the series at frequency $\lambda_j$. In particular, if $z_j = c_j$, a fixed constant (i.e. not a 
chance variable), then the power at frequency $\lambda_j$ is $|c_j|^2$. Note that $|c_j|$ is the 
magnitude of the term $c_j e^{i \lambda_j t}$ in the representation (i.e. 
$|c_j| = |c_j e^{i \lambda_j t}|$). When $z_j$ is a chance variable, the power of x(t) at 
frequency $\lambda_j$ is the variance of $z_j$, $E|z_j|^2$, and 
the power of x(t) is the total variance at all frequencies, which is also the variance 
of the ensemble x(t), t = 1, 2, ..., N. 

The power spectrum of the time series describes the distribution of the total 
power of x(t) at different frequencies. Define the power of x(t) at frequency $\lambda_j$ 
as a function of $\lambda_j$. The function so defined is the power spectrum of the series 
x(t). At frequency $\lambda_j$ the value of this function measures the contribution of the 
harmonic $e^{i \lambda_j t}$ to the total power of x(t). If the contribution at the frequency 
$\lambda_j$ is large compared to the neighboring frequencies, then there is a cycle embedded in 
the time series x(t) at frequency $\lambda_j$. It is understood that the power spectrum, or 
spectral analysis in general, is useful in other respects such as model building, 
prediction, filtering, control, simulation and optimization, etc. We shall restrict 
our study to exploring the short-term cycles of the series. For detailed discussion, 
interested readers are referred to Jenkins and Watts [37] and Koopmans [42]. 
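
To make the idea of "power at a frequency" concrete, the sketch below computes a raw 
periodogram by a direct discrete Fourier sum and picks the harmonic with the largest 
power. It is an illustration under stated assumptions, not the IMSL routine used in the 
report: the function name periodogram is hypothetical, no smoothing window is applied, 
and the series is synthetic rather than GATE data. 

```python
import cmath
import math


def periodogram(x):
    """Return [(period_in_steps, power)] for harmonics k = 1 .. n//2 of x."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    spectrum = []
    for k in range(1, n // 2 + 1):
        # Fourier coefficient of the harmonic that completes k cycles in n steps.
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered)) / n
        spectrum.append((n / k, abs(coeff) ** 2))
    return spectrum


# Synthetic series: a 10-hour cycle plus a weaker 24-hour cycle.
n = 240
series = [math.sin(2 * math.pi * t / 10) + 0.3 * math.sin(2 * math.pi * t / 24)
          for t in range(n)]
period, power = max(periodogram(series), key=lambda item: item[1])
print(f"dominant period: {period:.1f} hours")
```

A peak that stands well above its neighbours, as the 10-hour harmonic does here, is the 
kind of feature the text interprets as a cycle. With real rainfall series, which are 
short and mostly zero, raw periodogram peaks are much noisier and usually need smoothing. 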


3.4.2. Power spectrum estimate. 

The power spectrum estimate is computed using the FTFREQ subroutine of the 
International Mathematical and Statistical Libraries (IMSL). Due to the limited 
accessibility of the IMSL to the authors, only a preliminary exploratory study of the analysis 
is undertaken. The study is restricted to exploring the short-term cycles in the 
data. It is noted that the length of the time series is short for detecting cycles: 
366 for series in Phase I and 328 in Phase II. 

In Phase I, the series at grid (21,47) shows strongly a 10-hour cycle, at grid 
(53,35) a 25-hour cycle, at grids (64,90) and (65,62) a 12.5-hour cycle, and at 
(74,91) a 15-hour cycle. 

In Phase II, the series at (98,50) shows a 13-hour cycle and a 5-hour cycle, 
at (82,34) a slight 50-hour cycle, at (66,82) a 100-hour cycle, at (66,50) a slight 
33-hour cycle, at (50,98) a 33-hour cycle, at (50,2) 14-, 7- and 5-hour cycles, at 
(34,38) a 50-hour cycle and at (18,34) a 50-hour cycle. 

From the observations in the last two paragraphs, there does not seem to be a 
dominant cycle prevailing in all time series. A cycle of 12-13 hours is observed in 
two of the Phase I series and one of the Phase II series. A cycle of 50 hours (or more 
likely 2 days) is observed in three series in Phase II. 

3.5. Distribution of Total Power 

It was observed in the last section that the power varies wildly among series. 
This is true for both the total power and the power at all frequencies. To some extent, 
the power measures the "amount" of rainfall activity at a specific frequency. The 
total power is in fact the variance of the hourly rainfall averages (the ensemble). 
Since most of the hourly rainfall averages are zero, a large variance would indicate 
a large amount of rainfall from time to time or maybe frequent thunderstorm activities. 

Figures 3.5.1 and 3.5.2 show the plots of the total powers of time series at 
selected grids in Phase I and Phase II, respectively. Tables 3.5.1 and 3.5.2 list 
the values. 

It is observed that the values of the total power of the series in Phase I 
vary from .002 to 1.589. This variation is dramatic considering that there are 
366 observations involved. The value of 1.589 occurred at grid (65,62). It is 
found that at this grid there was an exceptionally large amount (23.71 cm/hour) of 
rainfall at one time. At grids (37,97) and (53,35) the total powers are .187 
and .131, respectively. These values are also considerably large compared to 
the rest. 

The total power of series in Phase II ranges from .002 to .050. The largest 
value is 25 times the smallest value. The ratio is moderate if one notes that 
the corresponding ratio in Phase I is 9,349. Thus the rainfall activities were 
somewhat similar among all localities during Phase II of the experiment while they 
were dramatically different during Phase I. 

Figure 3.5.2. Total Power of Hourly Rainfall Averages in Phase II at 
Selected Grids. 



Figure 3.6.1. The Mean of Hourly Rainfall and its 95% Confidence Interval in 
Phase I at Selected Grids. 

Figure 3.6.2. The Mean of Hourly Rainfall and its 95% Confidence Interval 
in Phase II at Selected Grids. 

3.6. Rainfall Average 

The mean of the hourly rainfall averages and its 95% confidence interval are 
listed in Tables 3.5.1 and 3.5.2. Although the mean of averages should be closer 
to a normal distribution than the mean of raw rainfall, the distribution of the mean 
of averages is far from being normal. In addition, the hourly rainfall averages 
are not independent either, as was observed in section 3 of this chapter. 
Therefore Tables 3.5.1 and 3.5.2 should be viewed with caution. 

Figures 3.6.1 and 3.6.2 display the mean, along with its 95% confidence 
interval, of the hourly averages, according to the location of observations. It 
is interesting to note that the rainfall distribution in Phase II is very even among 
all locations. The magnitude of the mean and the length of its confidence interval 
are comparable. They vary very little from location to location. But in Phase I, 
the story is completely different. The value of the mean ranges from .0014 cm/hr 
to .1320 cm/hr; the latter is about 94 times the former. The length of the confidence 
interval varies dramatically from location to location also. 



Table 3.5.1 Mean and Variance of Hourly Rainfall Averages in Phase I 

Table 3.5.2 Mean and Variance of Hourly Rainfall Averages in Phase II 

Chapter 4: Results on Diurnal Analysis 

4.1 Introduction. 

In this part of our report, the results of the Kolmogorov-Smirnov, Wilcoxon 
Signed Rank and the Chi-square Goodness of Fit tests are reported as they were 
applied to the noon and midnight average rainfall rates at each grid (4 square 
km) of the 160,000 square km array of GATE. 

Special programs were written to perform the analysis utilizing standard 
subprograms from S.S.P. One program prints the results of both tests in tabular 
form while two other programs were designed to present the output of the results 
pictorially as a 100 x 100 cartesian array of alpha-numeric symbols, each 
representing the outcome of the statistical procedures indicated above. The graphical 
approach makes it easy to detect clustering and/or periodic behavior in any region 
of the array. 

Data conditioned on the event of rain for noon or midnight were combined 
for the first two phases of GATE to produce a temporal resolution of 35 days at 
each point of the 100 x 100 array. Noon and midnight rate vectors of length 35 
were generated. 

Fifteen-minute instantaneous radar precipitation data for phases one and two 
were used in the study. To minimize the effects of missing data, 107 
records were used from phase one and 188 from phase two. The noon (12th hour) 
and midnight (0th hour) rainfall rates were obtained by taking averages of all 
data both one hour before and one hour after noon and midnight. Thus the 
neighborhood about both noon and midnight had a one-hour radius. 

4.2 Correlation 

In Chapter 3, a detailed analysis of the time-series approach to the question 
of temporal correlations was presented. Recall that a major (implicit) assumption 
of the Kolmogorov-Smirnov and the Wilcoxon Signed Rank tests is that the noon and 
midnight data are independent. A fundamental conclusion from the temporal analysis, 
Figs. 3.3.1 and 3.3.2 (pages 26 and 27), is that with a 98% confidence level, rainfalls 
5 or more hours apart are uncorrelated. 

It is well known that uncorrelated data need not be independent; however, from 
a practical point of view, this is all that can be expected. In a more technical 
vein, there can be no difference between the two concepts unless we are a priori 
given the joint distribution function, which is a part of the unknown information 
in this study. 

4.3 Kolmogorov-Smirnov Test 

A major objective was to determine if there is any mathematical difference 
in the empirical distribution of rainfall at noon versus midnight. This is 
equivalent to our test of the null hypothesis that the noon and midnight distributions 
are the same versus the alternate hypothesis that they are different. 
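
The quantity behind this test is the largest gap between the two empirical distribution 
functions. The sketch below computes only that statistic for one grid point; it is not 
the S.S.P. subprogram used in the report, the function name ks_statistic is 
hypothetical, the rain rates are made up, and the comparison with a critical value is 
left out. 

```python
def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    for value in sorted(set(a) | set(b)):
        # Count the observations <= value in each sample.
        while i < len(a) and a[i] <= value:
            i += 1
        while j < len(b) and b[j] <= value:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Made-up rain rates (cm/hr) for one grid point, not GATE values.
noon = [0.10, 0.40, 0.25, 0.80, 0.05]
midnight = [0.02, 0.03, 0.12, 0.01, 0.30]
print(ks_statistic(noon, midnight))
```

A statistic near 0 means the noon and midnight samples look alike; a value near 1 means 
they barely overlap. The null hypothesis is rejected when the statistic exceeds the 
tabulated critical value for the two sample sizes. 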

In this study we surveyed 10,000 grid points; 1,872 were excluded because 
they had too few rainfall events for analysis; 1,743 had noon and midnight data 
distributed in such a way that we could not reach any conclusion. In 5,425 grid 
areas, we found that the null hypothesis could be accepted and in 960 grid areas 
we found that the null hypothesis must be rejected and the alternate hypothesis 
accepted. 

Figure 4.2.1 displays the results pictorially in a 100 x 100 array. The 
letter D indicates that there is a significant difference between noon and midnight 
data (reject the null hypothesis). A blank indicates that there is no difference. It 
is easy to see that zero rainfall rates in both phases dominate the western border 
of the array. Heavy clustering of D's can be detected in the north-eastern and 
south-western regions of the array. We expect that this is due to the occurrence 
of heavy rain in these regions throughout the thirty-five day period of Phases I 
and II. This also implies that there may be a periodic spatial distribution of 
areas where there is a definite diurnal rainfall variation surrounded by regions 
where there is no variation. 

4.4 The Wilcoxon Signed Rank Test 

The Wilcoxon test allows us to test the null hypothesis that the means of the 
distributions of rainfall rate for noon and midnight are the same versus the alternative 
hypothesis that one is greater than the other. 
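
As an illustration of what the signed-rank procedure computes at one grid point, the 
sketch below ranks the absolute noon-minus-midnight differences and sums the ranks by 
sign. It assumes the two vectors are paired day by day and that zero differences are 
dropped (both assumptions; the report does not spell out these details). The function 
name signed_rank_sums and the data are hypothetical. 

```python
def signed_rank_sums(noon, midnight):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    diffs = [a - b for a, b in zip(noon, midnight) if a != b]  # drop zero differences
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    start = 0
    while start < len(ordered):
        end = start
        while end + 1 < len(ordered) and abs(ordered[end + 1]) == abs(ordered[start]):
            end += 1
        mid_rank = (start + end) / 2 + 1  # average rank shared by tied |differences|
        for k in range(start, end + 1):
            ranks[k] = mid_rank
        start = end + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Made-up paired rates (arbitrary units), one pair per day.
print(signed_rank_sums([5, 3, 8, 6, 2], [1, 4, 8, 2, 5]))  # (7.0, 3.0)
```

A W_plus much larger than W_minus points toward heavier noon rainfall (the "G" outcome), 
the reverse toward heavier midnight rainfall (the "L" outcome); the decision itself 
requires the tabulated critical value for the number of non-zero differences. 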

In this study, 1,872 grid areas were excluded because they had too few events for 
analysis; for 1,743 we could reach no conclusion. In 4,890 grid areas we found that 
the null hypothesis could be accepted while in 1,495 of the grid areas we found that 
the null hypothesis must be rejected and the alternate hypothesis accepted. 

This result is expected since the Wilcoxon test is less conservative than the 
Kolmogorov-Smirnov test. It is possible for the Kolmogorov-Smirnov test to detect no 
difference between the distributions of rainfall at noon and midnight while the 
Wilcoxon test still finds that the means are different. 

Figure 4.3.1 displays the results pictorially in a 100 x 100 array. The letter 
L indicates that noon rainfall is significantly less than midnight rainfall, the 
letter G indicates that noon rainfall is significantly greater than midnight rainfall, 
and a blank indicates no difference. 

The results of the Wilcoxon signed-rank test match those of the Kolmogorov-Smirnov 
test, as far as whether there is a difference in the rainfall distribution at noon 
and at midnight is concerned. From Figure 4.4.1, it is observed that the 
rainfall activities at noon and at midnight during the experiment period are rare 
in the western and north-western parts of the area. The noon 
rainfall is greater than the midnight rainfall in the southern and northeastern 
parts, as indicated by "G". The places where midnight rainfall is greater are 
scattered. 


4.5 Chi-Square Test 

The chi-square test is used to provide a further check on the results of 
the Kolmogorov-Smirnov and Wilcoxon signed rank tests. This test allows us to 
determine whether or not the percentages used for analysis using the other two 
statistics were actually preserved in the results of the analysis. In this case 
the $\chi^2$-test is defined by (see Chapter 2): 

\[ \chi^2_k = \sum_{j=1}^{k} \frac{(n_j - nP_j)^2}{nP_j} \]

where $n = \sum_{j=1}^{k} n_j$; $n_j$ is the number of grids for which the j-th conclusion 
was made; $P_j$ is the probability of making the j-th conclusion. 

We use $\alpha = .05$ for the level of significance for each experiment. 


The $\chi^2$ test applied to the K-S experiment. 

In the K-S test we have the following information: 

k = 2, j = 1, 2. 
$n_1 = 5425$, $P_1 = .90$, $P_2 = .10$ 
$n_2 = 960$ 
$n = n_1 + n_2 = 6385$ 
$\chi^2_{.05,2} = 3.84$ 

\[ \chi^2_2 = \frac{(5425 - 6385(.90))^2}{6385(.90)} + \frac{(960 - 6385(.10))^2}{6385(.10)} 
= \frac{(321.5)^2}{5746.5} + \frac{(321.5)^2}{638.5} = 17.99 + 161.88 = 179.87 > 3.84 \]

Hence, the percentages of 90 and 10 do not hold at the 5% level of significance 
and thus the null hypothesis $F^1 = F^2$ is not true and must be rejected. 



The $\chi^2$ test applied to the Wilcoxon experiment. 

In the Wilcoxon experiment we have the following information: 

k = 3, j = 1, 2, 3. 
$n_1 = 430$, $n_2 = 4890$, $n_3 = 1065$, $n = \sum_{j=1}^{3} n_j = 6385$ 
$P_1 = .10$, $P_2 = .80$, $P_3 = .10$ 
$\chi^2_{.05,3} = 5.99$ 

\[ \chi^2_3 = \frac{[430 - 6385(.10)]^2}{6385(.10)} + \frac{[4890 - 6385(.80)]^2}{6385(.80)} + \frac{[1065 - 6385(.10)]^2}{6385(.10)} \]

\[ = \frac{(208.5)^2}{639} + \frac{(218)^2}{5108} + \frac{(426.5)^2}{639} = 68.03 + 9.30 + 284.67 = 362.00 > 5.99 \]

Thus, the percentages of 10, 80, and 10% do not hold at the 5% level of significance 
and hence the null hypothesis $H_0$: $F^1 = F^2$ is not true and must be rejected. 
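
The two chi-square computations above can be reproduced with a short function. This is 
a sketch of the goodness-of-fit statistic as defined in this section; the function name 
chi_square_statistic is hypothetical, and the counts and probabilities are the ones 
quoted in the text. 

```python
def chi_square_statistic(observed, probabilities):
    """Goodness-of-fit statistic: sum over j of (n_j - n P_j)^2 / (n P_j)."""
    n = sum(observed)
    return sum((n_j - n * p_j) ** 2 / (n * p_j)
               for n_j, p_j in zip(observed, probabilities))


ks = chi_square_statistic([5425, 960], [0.90, 0.10])
wilcoxon = chi_square_statistic([430, 4890, 1065], [0.10, 0.80, 0.10])
print(f"K-S experiment:      {ks:.2f} (critical value 3.84)")
print(f"Wilcoxon experiment: {wilcoxon:.2f} (critical value 5.99)")
```

The K-S value agrees with the 179.87 reported above. The Wilcoxon value comes out near 
362.28 rather than 362.00, apparently because the hand calculation rounded the expected 
count 638.5 to 639; the conclusion is unaffected either way. 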


4.6. Summary on Diurnal Analyses. 

Two non-parametric tests were used to detect if there is a diurnal variation 
in oceanic rainfall using GATE data. Non-parametric methods were chosen because of 
their model-free characteristic. 

The tests were undertaken at grid points where data are available. Of the 10,000 
grid points in the study area, 1,872 were excluded because data are not available; 
those points are around the boundary of the square. In addition, there are 1,743 
grid points where no conclusion was obtained due to insufficient frequency of rain 
at midnight and at noon during the experiment period. The analysis was done 
at the remaining 6,385 grid points. 

The Kolmogorov-Smirnov test was used to test whether there is a difference in 
the rainfall distributions between noon and midnight. At a one out of 10 chance of 
error, it was found that out of 6,385 grid points, the rainfall distributions at 
noon and at midnight are different at 960 grid points and are the same at 5,425 grid 
points, which are 15.0 and 85.0%, respectively. If there were no difference in 
the rainfall distributions, these percentages would have been 10 and 90%, 
respectively. A chi-square test at a one out of 20 chance of error shows that this split of 
percentages was not followed. There are more grid points where the assertion of 
difference is made than expected. Thus, overall the rainfall distributions at 
noon and at midnight are different. This conclusion could have been made at 
practically any significance level. 

The Wilcoxon signed-rank test was used to test if the noon rainfall is less 
than, equal to, or greater than the midnight rainfall. It was found that, at a one 
out of 10 chance of error, the numbers of grid points falling in these three categories 
are, respectively, 430, 4,890, and 1,065. In terms of percentage, they are 6.73, 
76.59, and 16.68%, as compared to 10, 80, and 10%, respectively, which are expected 
if the assumption of no difference had been true. At a one out of 20 chance of error, 
the chi-square test concluded that the midnight rainfall is not equal to the noon 
rainfall; the noon rainfall is convincingly greater than the midnight rainfall. 
This conclusion could have been made at practically any significance level. 

From the analysis it was also found that the Wilcoxon signed-rank test is more 
sensitive than the Kolmogorov-Smirnov test in detecting the difference in the 
midnight and noon rainfalls. This is to be expected by the nature of these two tests. 